DOCUMENT RESUME

ED 050 031                                        SP 004 899

AUTHOR        Belland, John C.; And Others
TITLE         Analyzing Teacher Questions: A Comparative
              Evaluation of Two Observation Systems.
PUB DATE      71
NOTE          14p.; Paper presented at AERA annual meeting, New
              York, 1971

EDRS PRICE    EDRS Price MF-$0.65 HC-$3.29
DESCRIPTORS   *Classroom Observation Techniques, *Comparative
              Analysis, *Interaction Process Analysis,
              *Questioning Techniques, *Reliability, Teacher
              Behavior, Validity
IDENTIFIERS   Hough Duncan Observation System, Price Belland
              Observation System



ABSTRACT

General systems for analyzing instructional interaction have found
the most common teacher behavior to be questions. This evaluation
compares and contrasts two systems for analyzing teacher questions:
Price-Belland, developed by the [...] from the Bloom-Sanders tradition,
and Hough-Duncan, modified [...] detailed question analysis. Comparisons
were made 1) on the nature of the decision-making process required in
coding, 2) on the kinds of information derived from each system, and
3) on the interpretation of data displays derived from each system. In
comparing data [...] from using the two observation systems
simultaneously, significant differences were found between systems in
the percentage of [...] assigned to five of seven common categories. It
is recommended that work be continued on developing a more reliable
questioning-behavior analysis system. (RT)
ANALYZING TEACHER QUESTIONS: A COMPARATIVE
EVALUATION OF TWO OBSERVATION SYSTEMS



In studying the spate of instructional observation systems developed 
in recent years, it has become apparent that teachers spend a considerable 
proportion of their time asking questions. Several schemes have been 
proposed for analyzing questioning behavior in detail. 

Gall (1970) has recently published a comprehensive review of the 
various methodological traditions and findings in the area of questions 
asked by either teachers or students. He suggested that continuing effort 
should be expended in developing systems which will not only describe the 
kinds of questions which teachers ask, but also the kinds of questions 
which will lead students to the achievement of various educational objectives.
Beyond the examination of individual questions, Gall suggested that sequences
of questions or questioning strategies might assist that investigation. He
also suggested that developing analytic instruments designed for specific 
curricular areas would tend to make analyses of the data more productive. 

Very little has been written on the nature of the decision-making
process used in categorizing questioning behaviors (or any behaviors, for
that matter). In perhaps the bulk of observation systems, the decision is
made in much the same way as one responds to a matching-test item. One has a
list of categories and a series of behaviors to match with them. Sometimes
the list of categories is arranged along a continuum or in a
hierarchical taxonomy. When this occurs, the observer is encouraged to deal
with behaviors falling between two categories by applying an arbitrary ground rule.
Sanders (1966) and Clegg et al. (1967) have proposed observation schemes
based on the Bloom Taxonomy (1956). The investigators have proposed an
extension of this tradition (see Figure 1).

A few observation systems have relied on the observer making a series
of very simple decisions with respect to each categorization. In such
systems a final category is recorded by a series of symbols representing
the choice at each decision point. These decisions are commonly binary
in nature, although 3- and 4-way decisions also occur, and they most resemble
the decisions made by digital computers. Taba (Simon & Boyer, 1970) and
Hough-Duncan (1970) (see Figure 2) have developed systems based on such
decision-making trees.
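The idea of recording a final category as a series of symbols, one per decision point, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the order of the decisions, the names `Observation` and `code_behavior`, and the field values are hypothetical and are not taken from the OSIA manual. Only the category numbers (1-4 substantive, 10-13 managerial) and the closed/open suffixes (a/b) follow Figure 2.

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """One observed behavior, described by the answers to four simple decisions."""
    speaker: str      # "teacher" or "student" (binary decision)
    domain: str       # "substantive" or "managerial" (binary decision)
    function: str     # "clarifies", "responds", "initiates", "solicits" (4-way decision)
    open_ended: bool  # False = closed (a), True = open (b) (binary decision)

# Category numbers follow Figure 2: 1-4 substantive, 10-13 managerial.
FUNCTION_OFFSET = {"clarifies": 1, "responds": 2, "initiates": 3, "solicits": 4}
DOMAIN_BASE = {"substantive": 0, "managerial": 9}

def code_behavior(obs: Observation) -> str:
    """Build the final code symbol by symbol, one symbol per decision point."""
    prefix = "T" if obs.speaker == "teacher" else "S"
    number = DOMAIN_BASE[obs.domain] + FUNCTION_OFFSET[obs.function]
    suffix = "b" if obs.open_ended else "a"
    return f"{prefix}{number}{suffix}"

# A teacher asking an open substantive question is coded T4b.
open_question = code_behavior(Observation("teacher", "substantive", "solicits", True))
# A student giving a closed reply to a managerial request is coded S11a.
closed_reply = code_behavior(Observation("student", "managerial", "responds", False))
```

The contrast with a matching-style system is that the observer here never scans a full list of categories; each code is assembled from small choices, which is the property the comparison in this paper is meant to probe.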

It would seem likely that the nature of the decision-making process
would influence the kind of description produced by an observation system.
In order to make a preliminary investigation of this idea, it was decided
to categorize a set of audio recordings using the Price-Belland system and
the extension of the Hough-Duncan OSIA system (see Figure 2a). These two
systems have several final category elements in common and could be adjusted
to the same measuring unit: each new behavior.
Figure 1

CATEGORIES OF QUESTIONS AND RESPONSES
PRICE-BELLAND QUESTION ANALYSIS SYSTEM

      Code
Teacher  Student   Category

X        SX        OTHER BEHAVIOR: irrelevant, incomplete,
                   reinforcement, criticism, discourse
1        S1        PROCEDURAL: management, encouraging or
                   acknowledging person to speak
2        S2        INFORMATION: recognition or recall of
                   factual information; opinion
3        S3        TRANSLATION: change of information to
                   another communication
4        S4        INTERPRETATION: relationships
5        S5        APPLICATION: general rule to specific
                   case; solve a problem
6        S6        ANALYSIS: divide in parts
7        S7        SYNTHESIS: put parts together to produce
                   unique communication
8        S8        EVALUATION: judgment; theory
9        S9        AFFECTIVE: feeling, justification, belief
+        S+        CLARIFICATION: repeat, rephrase
P                  PROBING: follows incorrect answer, no
                   answer, or incomplete answer
         -         INDEFINITE: don't know, maybe, not sure
         0         SILENCE: no answer
         Z         CONFUSION: interference
Figure 2

The Observational System for Instructional Analysis

              Teacher    Student
              Behaviors  Behaviors

Substantive   T1         S1         Substantive clarification
              T2         S2         Responds to substantive solicitation
              T3         S3         Initiates substantive information
              T4         S4         Solicits substantive response

Appraisal     T5         S5         Corrective feedback
              T6         S6         Confirmation
              T7         S7         Acceptance
              T8         S8         Positive personal judgment
              T9         S9         Negative personal judgment

Managerial    T10        S10        Managerial clarification
              T11        S11        Responds to managerial solicitation
              T12        S12        Initiates managerial information
              T13        S13        Solicits managerial response

Silence       T14        S14        Silent covert activity
              T15        S15        Silent overt activity

Teacher or Student Behavior

              X 16       Instructionally non-functional behavior
              Y 17       Interaction separation designation

Categories 1-4 and 10-13 may be further categorized as
a. closed or b. open.
There were many other dimensions along which analyses could be made.
It was determined to check informally whether training intern-teachers in
questioning techniques as well as analysis would elicit changes in both
intern-teacher and student behaviors. Specifically: (a) could memory
questions and responses be reduced, and (b) could intern-teachers ask
questions to which students could respond appropriately?

In spite of considerable effort expended by the developers of the pre-1970
systems for analysis of teacher questions to disseminate their
techniques, there has been minimal use of those systems by teachers and
supervisors. If questioning is an important behavior and if analysis of
questioning will lead to a more controlled and flexible use of that
behavior, it is important to develop a system which makes a detailed
analysis of questioning behavior possible while remaining
sufficiently simple that it can be used by teachers and supervisors.



Objectives of the Study 

In order to develop a usable but detailed scheme for analyzing
classroom questions, it was decided to develop an observational system which
was based on the Bloom-Sanders tradition. Use of this system was compared
with use of Hough-Duncan in order to evaluate: (a) differences in qualities
of data, (b) differences in decision-making while categorizing.



Methods and Techniques

The steps in this investigation were to: 

1. Develop a system for coding and analyzing questioning 
behavior as noted above 

2. Carry out a pilot study 

3. Code and analyze these same data using the Hough-Duncan 
system 

4. Develop displays of the data 

5. Evaluate the data.

Pilot Study. The methodology of the pilot study involved the
quasi-experimental one-group pre- and posttest design. The intern-teacher used
different randomly selected groups of students for practice so that the
students received a minimum of practice in responding to the cues of the
intern-teacher.

First, each intern-teacher randomly assigned the students in his/her
class into five equal-sized groups. For the purpose of gathering pretest
data, each of the ten intern-teachers conducted a questioning session with
one randomly selected group of students (G1) for a total of 15 minutes of
audio recording. An observer categorized the interaction using the
Price-Belland observation system.

Each teacher then met with the observer to review the training materials
in this project to facilitate the self-instructional use of these materials.

Each intern-teacher learned the observation system for recording
questioning interaction. Each intern-teacher read selections from a
selected bibliography when necessary for further clarification. The
observation system for recording questioning interaction was considered
mastered by the intern-teacher when the practice tape was categorized, without
stopping, with four or fewer responses varying from the categorization
assigned by the system developers.
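The mastery criterion amounts to counting the positions at which a trainee's codes differ from the developers' codes for the practice tape. A minimal sketch follows, assuming both codings are lists of Price-Belland codes aligned behavior by behavior; the function names and the example codes are hypothetical, not data from the study.

```python
def count_disagreements(trainee_codes, developer_codes):
    """Count positions where the trainee's code differs from the developers' code."""
    if len(trainee_codes) != len(developer_codes):
        raise ValueError("both codings must cover the same sequence of behaviors")
    return sum(1 for t, d in zip(trainee_codes, developer_codes) if t != d)

def has_mastered(trainee_codes, developer_codes, max_disagreements=4):
    """Apply the 'four or fewer varying responses' criterion described in the text."""
    return count_disagreements(trainee_codes, developer_codes) <= max_disagreements

# Illustrative (invented) codings of a short practice tape.
developer = ["1", "2", "S2", "4", "S4", "+", "S4", "8", "S8", "X"]
trainee   = ["1", "2", "S2", "3", "S4", "+", "S4", "8", "S9", "X"]
mastered = has_mastered(trainee, developer)  # two disagreements, so the criterion is met
```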

Teachers arranged four 20-minute interview situations with groups of
students not used for the pretest (G2, G3, G4, G5) in which the questioning
session was audio-taped. The objective of this developmental work with
different students was two-fold:

a. The intern-teacher was to ask higher-level questions.

b. The teacher was to elicit higher-level responses from the
students.

Each teacher recorded the interaction on audiotape, categorizing only the
last 15 minutes of the teacher and student interaction in each session. This
produced a total of 1 hour 20 minutes of developmental work recorded on
audiotape, but just 1 hour of the interaction was categorized and diagnosed
by the intern-teachers. The time allowed for this developmental work was
a two-week period.

For a posttest session, each teacher recorded a questioning session for
15 minutes on audiotape with the same student group used in the pretest (G1).
The observer categorized this interaction.

Observer controls. The observer who categorized the tapes in the
Price-Belland system was one of the designers of that system. Reliability
was maintained by conferring with the other designers before each categorization
session and by recategorizing the practice tape. The criterion was
complete agreement with the previously agreed-upon categories.

OSIA Coding. The observer who categorized the tapes in the OSIA
extension was a student of Hough at The Ohio State University. The
principal investigator worked with this observer developing the definitions
of the extended categories and setting out the ground rules. All tapes were
analyzed by this observer in a two-day period. No quantitative measures of
reliability were made. The principal investigator has worked with OSIA since
1968. If his explanations of the categories were uniform across observers,
he provided the control necessary to infer that the differences resulted
from the decision-making process.
Data displays. The Price-Belland data were displayed in the form of a
percentage profile (see Figure 3). No attempt was made to preserve sequence
information in the display. Since the objectives of the training were that
the intern-teacher should be able to reduce the number of memory questions
and to elicit responses appropriate to the level of the questions, this
display seemed to be sufficient.
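A percentage profile of this kind discards sequence and keeps only a tally and a percent for each category. The sketch below shows one way such a profile could be computed, assuming the coded session is a flat list of category symbols for one speaker; the function name and the sample codes are hypothetical, and the row order follows Figure 3.

```python
from collections import Counter

# Row order of the profile, as listed in Figure 3.
CATEGORIES = ["X", "1", "2", "3", "4", "5", "6", "7", "8", "9", "+", "P", "0", "Z"]

def percentage_profile(codes):
    """Return {category: (count, rounded percent)}; sequence information is discarded."""
    counts = Counter(codes)
    total = len(codes)
    profile = {}
    for category in CATEGORIES:
        n = counts[category]
        percent = round(100 * n / total) if total else 0
        profile[category] = (n, percent)
    return profile

# Illustrative (invented) teacher codes from a short stretch of a session.
sample_codes = ["2", "2", "2", "4", "X"]
profile = percentage_profile(sample_codes)
# profile["2"] is (3, 60); profile["4"] and profile["X"] are each (1, 20).
```

Because each percent is rounded separately, a column need not sum to exactly 100.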

The data displays for the extended OSIA data were frequency and
percentage distributions and a chart of the flow of the questioning (see
Figure 4). Since these materials were not used to influence the
intern-teachers, it was deemed unnecessary to develop any of the other displays
designed for OSIA.



Data Sources 



The data were collected as tape-recorded samples of questioning
behavior from ten intern-teachers in May 1970. These teachers were nearing
the end of a year-long internship as part of the Elementary Master of Arts
in Teaching Program at the University of North Carolina, Chapel Hill.



Findings



Price-Belland observations of the pilot study. Since there were two
objectives of the training for intern-teachers in the pilot study, analysis
of the data follows in that format.

H1: If intern-teachers are trained to recognize memory questions
and are placed in practice situations designed to decrease the use of
such questions, they will be able to conduct a question-asking session
with a lower percentage of memory questions than they did before the
training and practice.

In order to test the null hypothesis formed from H1, the Wilcoxon
Matched-Pairs Signed-Ranks test was used. When the pre and post percentages of
memory questions were compared, the Wilcoxon T value was 7. This value is
significant at p < .025 when N = 10 in a one-tailed test. Since the median percent
of memory questions on the pretest was 9 and the median percent of memory
questions on the posttest was 7 1/4, the null hypothesis was rejected and
H1 confirmed.
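For readers unfamiliar with the statistic, the Wilcoxon T is the smaller of the two sums of signed ranks of the paired differences. The sketch below computes it in plain Python. The pre and post percentages are invented for illustration and are not the study's data; the function name is hypothetical. The critical value of 8 for N = 10 at the .025 one-tailed level is the commonly tabulated one, so the reported T of 7 falls within the rejection region.

```python
def wilcoxon_t(pre, post):
    """Return (T, N): the smaller signed-rank sum and the number of non-zero differences."""
    # Drop pairs with no change, as the test requires.
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        i = j
    positive = sum(r for d, r in zip(ordered, ranks) if d > 0)
    negative = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(positive, negative), len(ordered)

# Invented percentages of memory questions for ten intern-teachers.
pre_percent = [40, 35, 50, 28, 45, 33, 38, 42, 30, 36]
post_percent = [30, 33, 38, 31, 36, 29, 31, 34, 35, 25]
t_value, n_pairs = wilcoxon_t(pre_percent, post_percent)

CRITICAL_T = 8  # tabulated value for N = 10, one-tailed .025
significant = t_value <= CRITICAL_T
```

In this invented example only two teachers increased their percentage, and their differences carry ranks 2 and 4, so T is 6 and the decrease would be judged significant.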

H2: If intern-teachers are trained to analyze questions and
responses, they will be able to ask questions which will elicit
appropriate category responses.

This hypothesis was not tested in a formal manner. In Table 1 the direction
of the shifts in both teacher and student behaviors in each category is listed.
Out of 15 categories in which change could occur, the median number of
categories in which the intern-teacher and the students either shifted in
the same direction or mutually remained constant was 8. The range was from
Figure 3

QUESTION AND RESPONSE PROFILE

[Two profiles, each tabulating the number and percent of teacher and
student behaviors by category (X, 1-9, +, P, 0, Z, Totals) beside a
per cent profile scaled from 0 to 90. The cell values are not recoverable
from the source scan.]
Figure 4

Example of OSIA Data Display

[Flow chart in three panels: 1st Minute, Randomly Selected Internal
Minute, and Final Minute. Rows in each panel: T4'4, T4'3, T4'2, T4'1,
Tx-y, T1, X. The plotted points are not recoverable from the source scan.]
Table 1

Changes in Questioning Behavior Before and After Training
in the Price-Belland Question Analysis System

[Table body and legend symbols are not recoverable from the source scan.
Legend entries: increase from pre to post measures; decrease from pre to
post measures; no change from pre to post measures; zero in both measures.]



5 to 10. It appeared that this was not indicative of a strong effect, so
H2 could not be supported by the evidence above.
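The informal tally described for H2 can be expressed as a count of categories in which the teacher's and the students' pre-to-post shifts agree. The following sketch assumes each speaker's data are held as a mapping from category to a (pre, post) pair; the function names and the numbers are hypothetical. It also merges "no change" and "zero in both measures" into a single "constant" outcome, which is an assumption about how "mutually remained constant" was scored.

```python
def shift_direction(pre, post):
    """Classify a pre-to-post change as '+', '-', or '0' (constant)."""
    if post > pre:
        return "+"
    if post < pre:
        return "-"
    return "0"

def count_matching_shifts(teacher, student):
    """Count categories where teacher and students shifted the same way or both stayed constant."""
    return sum(
        1
        for category in teacher
        if shift_direction(*teacher[category]) == shift_direction(*student[category])
    )

# Invented (pre, post) percentages for three categories.
teacher_shifts = {"2": (38, 21), "4": (4, 7), "6": (0, 0)}
student_shifts = {"2": (38, 10), "4": (4, 4), "6": (0, 0)}
matches = count_matching_shifts(teacher_shifts, student_shifts)  # categories 2 and 6 agree
```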
Comparisons between Price-Belland and OSIA data. In order to make the
data from the two observation systems comparable, they were grouped into
seven categories according to the definitions by Hough and Duncan (1970):
(a) memory, (b) convergent, (c) divergent, (d) evaluation, (e) affective,
and (f) clarification solicitations; and (g) other behaviors. Table 2
summarizes the data from the two systems. The hypothesis was that the
percent of behavior in each category would be different in each system.

The Wilcoxon Matched-Pairs Signed-Ranks Test was used to test whether there 
was a significant difference in the same direction between the percents in 
a category derived from the two systems. Using the two-tailed test, the 
categories memory, divergent, affective, and other each showed significant 
differences. 

In the categories convergent and clarification, differences did occur,
but they ran in varying directions and so were not detected by the
statistical test. In the evaluation category, there were
so few uses of this behavior that no differences could be measured.
Generally, in more than half the categories there were significant
differences in percent of behavior use reported by the two systems. Thus,
if all the assumptions hold, it is likely that the decision-making
process will influence the category outcomes.



Recommendations 



Since training and practice in analysis of questions can influence 
question-asking behavior, it seems appropriate to continue to evolve 
observation systems which will yield information about the relationship 
of certain questions and question-asking strategies to educational 
objectives. In this design process, the validity of the products from 
various decision-making strategies will have to be studied. Either one 
of the two systems compared here or some other system must have a closer 
relationship to the reality perceived by the participants than the other 
systems are able to produce. 

The development of a system for analyzing teacher questioning
behavior and student response which provides: (a) decisive decision-making
in categorizing, (b) detailed recording of the teacher-student
interaction, and (c) data easily analyzed and interpreted by teacher,
supervisor, and researcher would facilitate study of this most common
teaching behavior. Hopefully, this comparison of recent systems
representing two traditions of questioning-behavior analysis will contribute
to the development of such a system.
from the Wilcoxon Matched-Pairs Signed-Ranks Test (Siegel, 1956, pp.